17 research outputs found

    Boosting XML Filtering with a Scalable FPGA-based Architecture

    The growing amount of XML-encoded data exchanged over the Internet increases the importance of XML-based publish-subscribe (pub-sub) and content-based routing systems. The input to such systems typically consists of a stream of XML documents and a set of user subscriptions expressed as XML queries. The pub-sub system filters the published documents and passes them to the subscribers. Pub-sub systems are characterized by very high input rates, so processing time is critical. In this paper we propose a "pure hardware" solution that uses XPath query blocks on an FPGA to solve the filtering problem. By exploiting the high throughput that an FPGA provides for parallel processing, our approach achieves drastically better throughput than existing software or mixed (hardware/software) architectures. The XPath queries (subscriptions) are translated to regular expressions, which are then mapped to FPGA devices. By introducing stacks within the FPGA we are able to express and process a wide range of path queries very efficiently in a scalable environment. Moreover, because the parser and the filter processing run on the same FPGA chip, the expensive communication costs that a multi-core system would incur are eliminated, enabling very fast and efficient pipelining. Our experimental evaluation reveals more than one order of magnitude improvement compared to traditional pub-sub systems. Comment: CIDR 200
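
The translation step described in this abstract (XPath subscriptions turned into regular expressions, with a stack tracking the open elements) can be illustrated in software. The following is a minimal sketch, not the paper's FPGA design: it assumes a simplified XPath subset (only `/`, `//`, element names and `*`), and the function names `xpath_to_regex` and `filter_document` as well as the `(kind, tag)` event format are hypothetical.

```python
import re

def xpath_to_regex(xpath):
    """Translate a simplified XPath (only /, //, names, *) into a regex
    over root-to-node paths written as '/a/b/c'. Hypothetical helper."""
    tokens = re.findall(r"(//|/)([A-Za-z_][\w.-]*|\*)", xpath)
    if "".join(axis + name for axis, name in tokens) != xpath:
        raise ValueError(f"unsupported XPath: {xpath!r}")
    parts = []
    for axis, name in tokens:
        step = "[^/]+" if name == "*" else re.escape(name)
        if axis == "//":
            # descendant axis: any number of intermediate elements
            parts.append(r"(?:/[^/]+)*/" + step)
        else:
            # child axis: exactly one level down
            parts.append("/" + step)
    return re.compile("".join(parts))

def filter_document(events, queries):
    """Return the set of queries matched by a stream of (kind, tag) events,
    where kind is 'open' or 'close'. A stack holds the open elements."""
    compiled = {q: xpath_to_regex(q) for q in queries}
    stack = []
    matched = set()
    for kind, tag in events:
        if kind == "open":
            stack.append(tag)
            path = "/" + "/".join(stack)
            for q, rx in compiled.items():
                if q not in matched and rx.fullmatch(path):
                    matched.add(q)
        elif kind == "close":
            stack.pop()
    return matched

# Event stream for <a><b><c/></b><d/></a>
events = [("open", "a"), ("open", "b"), ("open", "c"), ("close", "c"),
          ("close", "b"), ("open", "d"), ("close", "d"), ("close", "a")]
queries = ["/a/b/c", "//c", "/a/c", "/a//c", "/a/*", "//d/c"]
result = filter_document(events, queries)
```

In this sketch every query is checked sequentially on each open tag; the abstract's point is that on an FPGA the query blocks evaluate in parallel, which a software loop like this does not capture.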

    Querying Spatio-temporal Patterns in Mobile Phone-Call Databases

    Call Detail Record (CDR) databases contain millions of records with information about cell phone calls, including the position of the user when the call was made or received. This huge amount of spatiotemporal data opens the door to the study of human trajectories on a large scale without the bias that other sources (like GPS or WLAN networks) introduce in the population studied. It also provides a platform for a wide variety of studies, ranging from the spread of diseases to the planning of public transport. Nevertheless, previous work on spatiotemporal queries does not provide a framework flexible enough to express the complexity of human trajectories. In this paper we present the Spatiotemporal Pattern System (STPS) to query spatiotemporal patterns in very large CDR databases. STPS defines a regular-expression query language that is intuitive and that allows any combination of spatial and temporal predicates with constraints, including the use of variables. The design of the language took into consideration the layout of the areas covered by the cellular towers, as well as “areas” that label places of interest (e.g. neighborhoods, parks, etc.) and topological operators. STPS includes an underlying indexing structure and algorithms for query processing using different evaluation strategies. A full implementation of STPS is currently running with real, very large CDR databases at Telefónica Research Labs. An extensive performance evaluation of STPS shows that it can efficiently find complex mobility patterns in large CDR databases.

    Efficient Trajectory Joins using Symbolic Representations

    No full text
    Efficiently and accurately discovering similarities among moving object trajectories is a difficult problem that appears in many spatiotemporal applications. In this paper we consider how to efficiently evaluate trajectory joins, i.e., how to identify all pairs of similar trajectories between two datasets. Our approach represents an object trajectory as a sequence of symbols (i.e., a string). Based on special lower-bounding distances between two strings, we propose a pruning heuristic for reducing the number of trajectory pairs that need to be examined. Furthermore, we present an indexing scheme designed to support efficient evaluation of string similarities in secondary storage. Through a comprehensive experimental evaluation we present the advantages of the proposed techniques.
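
The filter-and-refine idea in this abstract (symbolize trajectories, prune pairs with a cheap lower bound, verify the rest) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify its lower-bounding distances, so the sketch swaps in the well-known bag (multiset) distance as a lower bound on edit distance, uses a uniform grid for symbolization, and compares symbol sequences rather than the original trajectories. All function names and the `cell` and `eps` parameters are hypothetical.

```python
from collections import Counter

def symbolize(traj, cell=1.0):
    """Map each (x, y) point to its grid cell; the trajectory becomes a
    sequence of symbols. The uniform grid is an assumption of this sketch."""
    return tuple((int(x // cell), int(y // cell)) for x, y in traj)

def edit_distance(s, t):
    """Classic dynamic-programming edit distance between two sequences."""
    prev = list(range(len(t) + 1))
    for i, a in enumerate(s, 1):
        cur = [i]
        for j, b in enumerate(t, 1):
            cur.append(min(prev[j] + 1,              # delete a
                           cur[j - 1] + 1,           # insert b
                           prev[j - 1] + (a != b)))  # substitute or match
        prev = cur
    return prev[-1]

def bag_lower_bound(s, t):
    """Bag distance: never exceeds the edit distance, cheap to compute.
    Used here as a stand-in for the paper's lower-bounding distances."""
    cs, ct = Counter(s), Counter(t)
    return max(sum((cs - ct).values()), sum((ct - cs).values()))

def trajectory_join(R, S, eps, cell=1.0):
    """Return (pairs, examined): index pairs whose symbol strings are within
    edit distance eps, and how many pairs survived the pruning step."""
    sr = [symbolize(r, cell) for r in R]
    ss = [symbolize(s, cell) for s in S]
    pairs, examined = [], 0
    for i, a in enumerate(sr):
        for j, b in enumerate(ss):
            if bag_lower_bound(a, b) > eps:
                continue  # pruned without computing the full distance
            examined += 1
            if edit_distance(a, b) <= eps:
                pairs.append((i, j))
    return pairs, examined

R = [[(0.5, 0.5), (1.5, 0.5), (2.5, 0.5)]]
S = [[(0.2, 0.1), (1.1, 0.9), (2.9, 0.3)],
     [(5.5, 5.5), (6.5, 5.5), (7.5, 5.5)]]
pairs, examined = trajectory_join(R, S, eps=1)
```

Because the bound never overestimates the true string distance, pruning on it cannot discard a qualifying pair; it only avoids some of the quadratic-cost distance computations. The secondary-storage index mentioned in the abstract is not modeled here.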

    Querying trajectories using flexible patterns

    No full text
    The wide adoption of GPS and cellular technologies has created many applications that collect and maintain large repositories of data in the form of trajectories. Previous work on querying and analyzing trajectory data typically falls into methods that address either spatial range and NN queries or similarity-based queries. Nevertheless, trajectories are complex objects whose behavior over time and space can be better captured as a sequence of interesting events. We thus facilitate the use of motion “pattern” queries, which allow the user to select trajectories based on specific motion patterns. Such patterns are described as regular expressions over a spatial alphabet that can be implicitly or explicitly anchored to the time domain. Moreover, we are interested in “flexible” patterns that allow the user to include “variables” in the query pattern and thus greatly increase its expressive power. In this paper we introduce a framework for efficient processing of flexible pattern queries. The framework includes an underlying indexing structure and algorithms for query processing using different evaluation strategies. An extensive performance evaluation of this framework shows significant performance improvement when compared to existing solutions.
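
To make the notion of a flexible pattern concrete, here is a minimal sketch of patterns over a spatial alphabet with variables. It is not the framework's query language or its index-based evaluation: it assumes each region is labeled by a single uppercase letter, ignores temporal anchoring, and uses a hypothetical token syntax (`?` for any one region, `*` for any run of regions, `@x` for a variable that must bind to the same region at every occurrence). The names `compile_pattern` and `query` are illustrative only.

```python
import re

def compile_pattern(tokens):
    """Compile a list of pattern tokens into a Python regex.
    Variables become a named group on first use and a backreference after."""
    seen = set()
    parts = []
    for tok in tokens:
        if tok == "?":
            parts.append("[A-Z]")       # exactly one region, any label
        elif tok == "*":
            parts.append("[A-Z]*")      # zero or more regions
        elif tok.startswith("@"):
            name = tok[1:]
            if name in seen:
                parts.append(f"(?P={name})")         # same region as before
            else:
                seen.add(name)
                parts.append(f"(?P<{name}>[A-Z])")   # bind the variable
        else:
            parts.append(re.escape(tok))             # fixed region label
    return re.compile("".join(parts))

def query(trajectories, tokens):
    """Return ids of trajectories containing the pattern as a contiguous run.
    Each trajectory is a sequence of single-letter region labels."""
    rx = compile_pattern(tokens)
    return [tid for tid, regions in trajectories.items()
            if rx.search("".join(regions))]

trajectories = {
    "t1": ["A", "B", "C", "A"],
    "t2": ["A", "B", "C", "D"],
    "t3": ["B", "B", "A"],
}
returns_to_region = query(trajectories, ["@x", "*", "@x"])
a_then_any_then_c = query(trajectories, ["A", "?", "C"])
```

The first query asks for trajectories that revisit some region without naming it, which is the kind of expressiveness that variables add over fixed-region patterns. A linear scan with a backtracking regex engine is used here purely for clarity.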

    Time Relaxed Spatiotemporal Trajectory Joins

    No full text
    Many spatiotemporal applications store moving object data in the form of trajectories. Various recent works have addressed interesting queries on trajectory data, mainly focusing on range queries and Nearest Neighbor queries. Here we examine another interesting query, the Time Relaxed Spatiotemporal Trajectory Join (TRSTJ), which effectively finds groups of moving objects that have followed similar movements at different times. We first attempt to address the TRSTJ problem using a symbolic representation algorithm that we recently proposed for trajectory joins. However, we show experimentally that this solution produces false positives that grow rapidly as the problem size increases. As a result, it is inefficient for TRSTJ queries, since it leads to large query time overhead. To improve query performance, we propose two important heuristics that make the symbolic representation approach effective for TRSTJ queries. Our first improvement allows the use of multiple origins when processing strings representing trajectories. The experimental evaluation shows that the multiple-origin approach drastically improves query performance. We then present a “divide and conquer” approach to further reduce false positives through symbolic class separation. The proposed solutions can be combined, which leads to even better query performance. We present an experimental study revealing the advantages of using these approaches.

    FlexTrack: a System for Querying Flexible Patterns in Trajectory Databases

    No full text
    We describe the FlexTrack system for querying trajectories using flexible pattern queries. Such queries are composed of a sequence of simple spatio-temporal predicates, e.g., range and nearest-neighbor predicates, as well as complex motion pattern predicates, e.g., predicates that contain variables and constraints. Users can interactively select spatio-temporal predicates to construct such pattern queries using a hierarchy of regions that partition the spatial domain. Several different query processing algorithms are currently implemented and available in the FlexTrack system.